1.  Introduction


This report considers the problem of selling an asset on the open
market. As the seller waits for a good offer, he or she receives a
sequence of offers that is random both in time and in magnitude. After
each offer is received, the seller must decide whether or not to sell,
weighing the possibility of obtaining a better offer against the cost of
continuing to wait. Successive offers are independent random variables
with a common distribution F having finite mean and variance. There
is a cost c > 0 incurred for each unit of time the asset remains
unsold. We will consider three alternative assumptions about the timing
of offers. The first is that offers arrive with a fixed, known
periodicity (periodic offer rate). The second is that each period there
is a fixed but unknown probability of receiving an offer, which is
independent of the size of the offer (geometric offer rate). The third is
that the times between offers are independent, identically distributed
random variables with a known distribution (random offer rate). Under the
first two assumptions, offers can only be accepted at the start of a
period. Under the third assumption, any offer still in force can be
accepted at any time. The objective is to maximize the total expected
net revenue from the search and sale. We will consider both the case
where only the most recently obtained offer may be accepted (no recall),
and the case where any previously received offer may be accepted (recall
allowed). Also, we will consider both finite and infinite time horizons.

The selling problem is one of several related problems, the most
general of which is that of optimally acquiring and divesting assets.
This latter problem, however, involves a sequence of ongoing decisions,
whereas the selling problem involves only a one-time stopping decision.
The problem most closely related to the selling problem is that of buying
an asset. The buying problem, however, is less general in that offers
(price quotes) are assumed to arrive periodically. In the selling problem
the scenarios of geometric and random offer rates are important, because
the seller usually must wait passively for offers to arrive, whereas the
buyer may actively search for the best price. The timing of offers is
thus an important aspect of this report.

The selling problem arises in many contexts in addition to selling an
asset. One is successively interviewing candidates for a job (the
secretary problem). Another is in quality assurance, where one sequentially
inspects items from a population to find one with an acceptable measure of
quality. In this context a periodic offer rate would apply when the
inspection process is regular, with a fixed cost or time per inspection. A
geometric offer rate would apply when there is a fixed but unknown
probability that a given inspection will not yield conclusive results and
will have to be repeated. A random offer rate would apply when the
inspection process itself is irregular, or when the next item to be
inspected must first be located, the locating process taking a random
amount of time.

A  number  of  authors  have  established  the  existence  and  properties  of 
optimal  search  policies  when  the  price  distribution,  F  ,  is  known  and 
the  offer  rate  is  periodic: 

1)  Optimal  stopping  rules  have  been  shown  to  exist,  both  when 
recall  is  allowed  and  under  no  recall.  This  result  requires 
only  the  hypothesis  that  max{Z,0}  have  a  finite  mean  and 
variance,  where  Z  ~  F  . 

2)  The optimal stopping rule is characterized by a reservation
price, i.e., a price R, possibly dependent upon the number
of periods remaining or originally available, such that the
seller should continue to wait if and only if the current
best available offer is less than R.

3)  The  infinite-horizon  optimal  expected  net  return  exists  and 
is  the  limit  of  the  finite-horizon  optimal  expected  returns. 

4)  The  optimal  stopping  rule  when  recall  is  allowed  is  myopic, 
i.e.,  the  same  optimal  policy  is  arrived  at  if,  regardless 
of  the  number  of  periods  actually  remaining,  one  acts  as  if 
only  one  period  remains.  (In  this  case  the  reservation  price 
is  independent  of  the  time  horizon.) 

5)  When  recall  is  allowed  the  optimal  policy  never  accepts  a 
previously  passed-by  offer  except  possibly  in  the  last  period. 

6)  In  the  case  of  no  recall,  the  finite-horizon  reservation 
prices  converge  to  the  infinite-horizon  reservation  price. 

This  report  will  investigate  conditions  under  which  the  above 
properties  hold  when  the  distribution  of  offers  is  unknown  and  the 
seller's  prior  distribution  of  offers  undergoes  a  Bayesian  updating 
as  successive  offers  are  received.  We  will  also  examine  the  effects 
of  the  alternative  assumptions  on  the  timing  of  offers.  For  previous 
work  on  this  and  related  problems,  see  Albright  [2],  DeGroot  [4],  Derman, 
Lieberman,  and  Ross  [5],  Kohn  and  Shavell  [9],  Rothschild  [12],  and 
Telser  [15].  In  general,  however,  only  limited  results  have  been 
obtained  regarding  the  form  of  optimal  policies. 

In the recall-allowed case, the main objective of this report is to
investigate the efficiency and properties of myopic policies. (Myopic
policies have also been investigated by Chow and Robbins [3],
Abdel-Hameed [1], and Pratt, Wise, and Zeckhauser [11].) For the no-recall
case, the objective is to derive conditions which ensure the optimal
policy is reservation (characterized by a reservation price). For the
related buying problem, see Rothschild [12], and Rosenfield and
Shapiro [13].

In Section 2, we illustrate the greater complexity of optimal
policies when the distribution of offers is unknown. In Sections 3 and 4,
we  consider  the  cases  of  recall  allowed  and  no  recall,  respectively, 
assuming  a  periodic  offer  rate.  In  Section  5,  we  investigate  extensions 
to  geometric  and  random  offer  rates. 

2.  Some Counterexamples

When  the  distribution  of  offers  is  unknown  and  the  seller  updates 
his  or  her  beliefs  about  it  after  each  offer  is  observed,  many  of  the 
properties  established  for  the  known-distribution  case  may  fail  to  hold 
unless  additional  conditions  are  imposed.  This  is  so  even  in  the 
simplest  case  of  a  periodic  offer  rate.  Consider  the  following 
examples. 

Example 2.1 - Recall Allowed. There exist two possible offer
distributions. Each concentrates its mass on two offers, the first on $400
and $600, the second on $600 and $800. Within each distribution, the
lower offer is nine times as likely as the higher offer. The cost per
offer is $12. Suppose that, a priori, the first distribution is nine
times as likely to prevail as the second, and that the first offer
received is $600. Then a posteriori, the two price distributions are
equally likely. With one period to go, it is optimal to stop and
receive a net revenue of $588, since the alternative of continuing has
an expected return of $586. An analysis of the problem with two periods
to go is diagrammed below.

The optimal policy stops if an offer of $800 is received, since
it is the maximum offer possible. Also, if, after receiving an offer
of $600, an offer of $400 is subsequently observed, the optimal policy
stops and recalls the $600 offer, since the $400 offer implies the
first price distribution is the one that prevails. With one period
to go, the expected value for continuing, having observed $600 twice,
is $582. The net revenue from stopping is $576, and so the optimal
policy continues. With two periods to go, the expected net revenue
from continuing, having observed $600 once, is $589. The net revenue
from stopping is $588, and so the optimal policy continues. Since the
one-period look-ahead analysis said to stop, the optimal policy is not
myopic. Also, the optimal policy is not characterized by a reservation
price since it continues after receiving a $600 offer but stops after
a $400 offer. Finally, the optimal policy sometimes stops and recalls
a previous price (if a $600 offer is followed by a $400 offer) before
the final period; this never happens if the price distribution is known.
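
The dollar figures quoted in Example 2.1 can be reproduced by a small backward induction. The following is a minimal sketch, not the authors' own computation: the function names are illustrative, exact rational arithmetic is used to avoid rounding, and it assumes the $12 cost is charged for every offer including the first (which is what makes the stopping value after one $600 offer equal to $588).

```python
from fractions import Fraction

# Two candidate offer distributions from Example 2.1 (offer -> probability).
DISTS = [
    {400: Fraction(9, 10), 600: Fraction(1, 10)},
    {600: Fraction(9, 10), 800: Fraction(1, 10)},
]
PRIOR = [Fraction(9, 10), Fraction(1, 10)]
COST = 12  # assumed to be charged for every offer, including the first


def posterior(history):
    """Posterior weights on the two distributions given the observed offers."""
    weights = []
    for prior, dist in zip(PRIOR, DISTS):
        w = prior
        for offer in history:
            w *= dist.get(offer, Fraction(0))
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]


def predictive(history):
    """Forecasting distribution of the next offer given the history."""
    pred = {}
    for weight, dist in zip(posterior(history), DISTS):
        for offer, prob in dist.items():
            pred[offer] = pred.get(offer, Fraction(0)) + weight * prob
    return {offer: prob for offer, prob in pred.items() if prob > 0}


def stop_value(history):
    """Net revenue from stopping now and recalling the best offer so far."""
    return Fraction(max(history) - COST * len(history))


def continue_value(history, periods_left):
    """Expected net revenue from taking one more offer, then acting optimally."""
    return sum(
        prob * value(history + (offer,), periods_left - 1)
        for offer, prob in predictive(history).items()
    )


def value(history, periods_left):
    """Optimal expected net revenue with `periods_left` further offers allowed."""
    if periods_left == 0:
        return stop_value(history)
    return max(stop_value(history), continue_value(history, periods_left))
```

Calling `continue_value((600,), 1)` gives the one-period look-ahead value, while `continue_value((600,), 2)` gives the two-period value that overturns the myopic decision.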

Example 2.2 - No Recall (due to Rothschild [12]). There exist two
possible offer distributions. The first is degenerate at $3; the
second concentrates its mass on $4 and $5, with $5 being far more
likely. If an offer of $3 is received, the optimal policy stops;
but if an offer of $4 is received, the optimal policy continues
since the likelihood of soon observing $5 is high. Thus, no
reservation price exists.

3.  Unknown  Price  Distribution  with  Recall  Allowed 

We consider in this section and the next the case of a periodic
rate of one offer per unit of time. Let $F(p \mid \mathbf{p})$ be the
forecasting distribution of the next offer, given the vector
$\mathbf{p} = (p_1, \ldots, p_N)$ of previously received offers. Let
$z(\mathbf{p}) = \max_{i \le N} p_i$ denote the best offer received so far,
and $V_t(\mathbf{p})$ denote the maximum expected net return given a
history of offers $\mathbf{p}$ and a t-period horizon. The recursive
relationship which characterizes the finite-horizon maximum expected net
return is

$$V_t(\mathbf{p}) = \max\Big\{ z(\mathbf{p}),\; -c + \int V_{t-1}(\mathbf{p}, p)\, dF(p \mid \mathbf{p}) \Big\}. \tag{3.1}$$


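Following a myopic policy, the stopping condition is obtained from a
one-period look ahead:

$$V_1(\mathbf{p}) = \max\Big\{ z(\mathbf{p}),\; -c + \int \big(z(\mathbf{p}) \vee p\big)\, dF(p \mid \mathbf{p}) \Big\}.$$

The stopping condition for this one-period look-ahead policy is

$$z(\mathbf{p}) > -c + \int \big(z(\mathbf{p}) \vee p\big)\, dF(p \mid \mathbf{p}) \tag{3.2}$$

or, equivalently,

$$c > G(z(\mathbf{p}) \mid \mathbf{p}), \tag{3.3}$$

where

$$G(x \mid \mathbf{p}) = \int_x^\infty \bar F(p \mid \mathbf{p})\, dp,$$

and where

$$\bar F(p \mid \mathbf{p}) = 1 - F(p \mid \mathbf{p}).$$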
If (3.3) determines the optimal stopping rule, then the optimal policy
is myopic. A sufficient condition for this to occur is as follows.

Theorem 1. If $G(z(\mathbf{p}) \mid \mathbf{p})$ never crosses c from below
with additional observations, then the optimal policy (finite and infinite
horizons) is myopic.


Proof. The finite-horizon proof is by induction on the number of
periods, t. First suppose that (3.3) (or equivalently (3.2)) holds
for $\mathbf{p}$. Then by the hypothesis of the theorem, (3.3) will hold for
$(\mathbf{p}, p_{N+1})$, and so a one-period look-ahead criterion will say
stop. Thus, by the induction hypothesis

$$V_{t-1}(\mathbf{p}, p_{N+1}) = z(\mathbf{p}) \vee p_{N+1}$$

and with t-1 periods to go it is optimal to stop. In this case,
the optimal stopping criterion for the t-period problem is

$$-c + \int V_{t-1}(\mathbf{p}, p)\, dF(p \mid \mathbf{p}) - z(\mathbf{p}) = -c + \int \big(z(\mathbf{p}) \vee p\big)\, dF(p \mid \mathbf{p}) - z(\mathbf{p}) < 0,$$

and so stopping is also optimal for the t-period problem. If (3.3) does
not hold, then

$$-c + \int V_{t-1}(\mathbf{p}, p)\, dF(p \mid \mathbf{p}) - z(\mathbf{p}) \ge -c + \int \big(z(\mathbf{p}) \vee p\big)\, dF(p \mid \mathbf{p}) - z(\mathbf{p}) \ge 0,$$

and it is optimal to continue.

The infinite-horizon optimal net return V is the limit of the
finite-horizon returns, so if (3.3) holds

$$V = \lim_{t \to \infty} V_t = z(\mathbf{p}) \vee p_{N+1},$$

and the optimal return is obtained by stopping. As in the finite-horizon
proof, when (3.3) does not hold

$$-c + \int V(\mathbf{p}, p)\, dF(p \mid \mathbf{p}) - z(\mathbf{p}) \ge -c + \int \big(z(\mathbf{p}) \vee p\big)\, dF(p \mid \mathbf{p}) - z(\mathbf{p}) \ge 0,$$

and it is optimal to continue.


The quantity $G(z(\mathbf{p}) \mid \mathbf{p})$ is the expected gain from
one more offer. As Theorem 1 shows, if this quantity never crosses c from
below with additional observations, then as soon as the expected gain
from one more observation becomes less than the cost to obtain it,
the expected gain from any number of additional observations will not
cover their cost. Thus a myopic policy prevails. Without this kind
of condition on the behavior of $G(z(\mathbf{p}) \mid \mathbf{p})$, a
one-period look-ahead analysis may be inadequate.
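
The reading of G as an expected gain, and the equivalence of (3.2) and (3.3), can be checked in one line. The sketch below suppresses the history, writing z for z(p) and F for the forecasting distribution (shorthand introduced here), and assumes that distribution has a finite mean so that the integration by parts is valid.

```latex
\begin{align*}
\int (z \vee p)\, dF(p) - z
  &= \int_z^\infty (p - z)\, dF(p) \\
  &= \int_z^\infty \bar F(p)\, dp \;=\; G(z \mid \mathbf{p}),
\end{align*}
% Hence  z > -c + \int (z \vee p)\, dF(p)  holds exactly when  c > G(z \mid \mathbf{p}).
```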

Corollary 1. If $G(z(\mathbf{p}) \mid p_1, \ldots, p_N)$ depends only upon N
and $z(\mathbf{p})$, and is nonincreasing in each, then the optimal policy
is myopic and is characterized by a reservation price. Furthermore, the
reservation price is nonincreasing in N.

Proof. By the monotonicity of $G(z(\mathbf{p}) \mid \mathbf{p})$ in N and
$z(\mathbf{p})$, $G(z(\mathbf{p}) \mid \mathbf{p})$ does not increase with
additional observations. Thus by Theorem 1, the optimal policy is myopic.
The optimal policy is determined by the sign of

$$G(z(\mathbf{p}) \mid \mathbf{p}) - c.$$

Let $R_N = \sup\{p : G(p \mid p_1, \ldots, p_N) - c \ge 0\}$. Since G is
nonincreasing in $z(\mathbf{p})$ one should stop if and only if
$z(\mathbf{p}) > R_N$, and since G is nonincreasing in N, the reservation
price $R_N$ is nonincreasing in N.

The monotonicity in N of the reservation price can lead to
situations where the reservation price falls below the lowest
possible offer, and therefore to situations where one should obtain one
more offer and then stop regardless of what it is. This will be illustrated
in an example to follow. First, however, we derive another corollary
of Theorem 1.

Corollary 2. Suppose offers are distributed as F(p+t), where t
is an unknown translation, and that the posterior distribution for
t is a function $h(t \mid z(\mathbf{p}), N)$, which depends upon
$z(\mathbf{p})$ and N only. Suppose also that for $z(\mathbf{p})$ fixed,
$h(t \mid z(\mathbf{p}), N)$ has monotone likelihood ratio (MLR) in t
relative to N, and for N fixed, $h(u - z(\mathbf{p}) \mid z(\mathbf{p}), N)$
has MLR in u relative to $z(\mathbf{p})$. Then the optimal policy is myopic
with nonincreasing reservation prices.


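Proof. We establish the result by showing that the conditions of
Corollary 1 are satisfied. Write $G(x) = \int_x^\infty \bar F(p)\, dp$ for
the untranslated distribution F. Then

$$G(z(\mathbf{p}) \mid \mathbf{p}) = \int_{z(\mathbf{p})}^\infty \bar F(p \mid \mathbf{p})\, dp = \int_{z(\mathbf{p})}^\infty \int \bar F(p+t)\, h(t \mid z(\mathbf{p}), N)\, dt\, dp = \int G(z(\mathbf{p}) + t)\, h(t \mid z(\mathbf{p}), N)\, dt.$$

Since $G(z(\mathbf{p}) \mid \mathbf{p})$ depends only upon $z(\mathbf{p})$
and N, all that remains to show is its monotonicity. Recall that a density
$f(x \mid \theta)$ has monotone likelihood ratio in x relative to $\theta$
if and only if for all $\theta_1 > \theta_2$,
$f(x \mid \theta_1)/f(x \mid \theta_2)$ is nondecreasing in x. It can be
shown that if a density $f(x \mid \theta)$ has MLR in x relative to
$\theta$ and if $\gamma(x)$ is monotonic in x, then

$$\int \gamma(x)\, f(x \mid \theta)\, dx$$

is monotonic in $\theta$ in the same direction as $\gamma$. Applying this
result to the above expression for $G(z(\mathbf{p}) \mid \mathbf{p})$ yields
that $G(z(\mathbf{p}) \mid p_1, \ldots, p_N)$ is nonincreasing in N. Also,

$$G(z(\mathbf{p}) \mid \mathbf{p}) = \int G(u)\, h(u - z(\mathbf{p}) \mid z(\mathbf{p}), N)\, du,$$

so applying the result once more yields that
$G(z(\mathbf{p}) \mid \mathbf{p})$ is nonincreasing in $z(\mathbf{p})$. □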

Example 3.1. Multinomial Distribution with a Dirichlet Prior

There exist m possible offers $p_1 < \cdots < p_m$. The Dirichlet
prior is characterized by parameters $N_1, \ldots, N_m$ which are analogous
to the frequencies of these offers. The probability of observing
offer $p_i$ is $N_i/N$, where $N = \sum_i N_i$. After observing an offer
$p_i$, the prior distribution is updated by incrementing $N_i$ by one,
effectively increasing the probability of observing $p_i$ somewhat and
decreasing the probability of the other offers slightly.

Since

$$G(z(\mathbf{p}) \mid \mathbf{p}) = \sum_{k:\; z(\mathbf{p}) \le p_k < p_m} (p_{k+1} - p_k) \sum_{j=k+1}^{m} N_j/N, \tag{3.4}$$

it follows that the multinomial distribution with a Dirichlet prior
satisfies the hypothesis of Corollary 1. Thus the optimal policy is
myopic and is characterized by a sequence of reservation prices which
is nonincreasing in N. To illustrate the one-more-offer-then-stop
phenomenon, take m = 3, $N_1 = 3$, $N_2 = 2$, $N_3 = 1$, $p_1 = 7$,
$p_2 = 14$, $p_3 = 21$, c = 3.75. If the first offer is $7 we continue; but
whatever offer is next received we stop, since G(k | 7, k) < c for
k = 7, 14, 21.
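
The arithmetic behind the one-more-offer-then-stop illustration can be checked with a few lines of code. This is a minimal sketch using the parameter values of Example 3.1; the function name `expected_gain` is illustrative, and the gain is computed as the posterior expectation of the excess of the next offer over the best offer so far, which equals the double sum in the text.

```python
from fractions import Fraction

OFFERS = [7, 14, 21]        # possible offers p_1 < p_2 < p_3
PRIOR_COUNTS = [3, 2, 1]    # Dirichlet parameters N_1, N_2, N_3
COST = Fraction(15, 4)      # c = 3.75


def expected_gain(history):
    """G(z | history): expected excess of the next offer over the best so far."""
    counts = list(PRIOR_COUNTS)
    for offer in history:
        counts[OFFERS.index(offer)] += 1  # Dirichlet update: bump observed offer
    total = sum(counts)
    best = max(history)
    return sum(
        Fraction(count, total) * (offer - best)
        for offer, count in zip(OFFERS, counts)
        if offer > best
    )


def myopic_rule_stops(history):
    """One-period look-ahead rule: stop exactly when c > G."""
    return COST > expected_gain(history)
```

After a first offer of 7 the gain is 4, which exceeds c, so the rule continues; after any second offer the gain is at most 3.5 and the rule stops.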


Example 3.2. Exponential Distribution with an Unknown Translation

Let X be a random variable having an exponential distribution
$H(x) = 1 - e^{-\lambda(t+x)}$ with an unknown translation t. Let
$Z = -2t - X$. This operation re-orients the exponential
distribution, putting its mass on $(-\infty, -t]$ instead of $[-t, \infty)$.
The distribution of an offer Z is then $F(z) = e^{\lambda(t+z)}$
$(t + z \le 0)$. Such a distribution may be realistic because often the
seller does not know of an explicit upper bound on offers, but may believe
that offers are clustered near the unknown upper bound.

Let r(t) be the prior density on the unknown translation, and
assume that r(·) is log-concave. Then the posterior density for t
can be written as

$$h(t \mid \mathbf{p}) = \frac{r(t)\, \lambda^N \exp\{\lambda \sum_i (p_i + t)\}\, I\{z(\mathbf{p}) + t \le 0\}}{\int r(u)\, \lambda^N \exp\{\lambda \sum_i (p_i + u)\}\, I\{z(\mathbf{p}) + u \le 0\}\, du} = K(z(\mathbf{p}))\, r(t)\, e^{\lambda N t}\, I\{z(\mathbf{p}) + t \le 0\},$$

where $K(z(\mathbf{p}))$ is a function which depends upon $z(\mathbf{p})$
only. Thus the posterior density of t can be written as a function of
$z(\mathbf{p})$ and N. For a fixed value of $z(\mathbf{p})$ and
$N \ge M$,

$$\frac{h(t \mid z(\mathbf{p}), N)}{h(t \mid z(\mathbf{p}), M)} \propto e^{\lambda (N-M) t}.$$

Therefore $h(t \mid z(\mathbf{p}), N)$ has MLR in t relative to N. For fixed
N and $y \ge x$,

$$\frac{h(u - y \mid y, N)}{h(u - x \mid x, N)} = C(x,y)\, \frac{r(u-y)}{r(u-x)},$$

where C(x,y) depends upon x, y and N, but not u. By the
log-concavity of r(·), r(u-y)/r(u-x) is nondecreasing in u, and
so $h(u - z(\mathbf{p}) \mid z(\mathbf{p}), N)$ has MLR in u relative to
$z(\mathbf{p})$. Thus the conditions of Corollary 2 are satisfied and the
optimal policy is myopic with nonincreasing reservation prices.

Example 3.3. Normal Distribution with Unknown Mean

Assume that offers are normally distributed with unknown mean $\mu$
and variance 1, and assume the prior distribution of $\mu$ is itself
normal with mean $\mu_0$ and variance $1/\tau_0$. Let $\Phi$ and $\varphi$
be the distribution and density functions of a standard normal random
variable. This example does not satisfy the hypothesis of either corollary.
However, as shown in DeGroot [4], if

2  >  T0  *  2 
c  —  (Tq  +  1)  2tt 


then the optimal policy is myopic. Furthermore, the stopping criterion
is

$$z(\mathbf{p}) - \mu(\mathbf{p}) > \frac{\Psi^{-1}\big(c\,[\tau/(\tau+1)]^{1/2}\big)}{[\tau/(\tau+1)]^{1/2}},$$

where $\mu(\mathbf{p})$ is the posterior mean, $\tau = \tau_0 + N$ is the
posterior precision (reciprocal of variance), and

$$\Psi(x) = \int_x^\infty [1 - \Phi(z)]\, dz = \varphi(x) - x[1 - \Phi(x)].$$

The optimal policy for this example may not be reservation. The search
may terminate not only upon receiving a very high offer, but also a
very low one. (In the latter case the seller would realize offers are
much lower than originally thought.)
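
The stopping criterion of Example 3.3 is straightforward to evaluate numerically. The sketch below is illustrative only: the names `psi`, `psi_inverse`, and `myopic_gap` are introduced here, the inverse is obtained by plain bisection rather than by any method from the source, and the code does not check DeGroot's condition under which the myopic rule is actually optimal.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def upper_tail(x):
    """1 - Phi(x), computed with erfc to avoid cancellation for large x."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def psi(x):
    """Psi(x) = integral from x to infinity of [1 - Phi(z)] dz."""
    return phi(x) - x * upper_tail(x)


def psi_inverse(y, lo=-50.0, hi=50.0, iters=200):
    """Invert the strictly decreasing function psi by bisection (0 < y < 50)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if psi(mid) > y:
            lo = mid  # psi is still too large, so the root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)


def myopic_gap(c, tau):
    """Threshold on z - mu above which the one-period look-ahead rule stops."""
    scale = math.sqrt(tau / (tau + 1.0))
    return psi_inverse(c * scale) / scale
```

For a given cost c and posterior precision tau, the look-ahead rule stops once the best offer exceeds the posterior mean by more than `myopic_gap(c, tau)`; the threshold shrinks as c grows.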

To  summarize,  Theorem  1  roughly  says  that  for  the  optimal  policy 
to  be  myopic,  observing  a  low  offer  must  cause  the  mass  associated  with 
high  offers  to  decrease.  Distributions  with  origin-related  (or,  in 
some  cases,  mean-related)  unknown  parameters  typically  have  this  prop¬ 
erty.  In  general,  one  would  not  expect  to  find  myopic  optimal  policies 
associated  with  distributions  with  unknown  spread-related  parameters. 

For  example,  it  can  be  shown  that  the  optimal  stopping  rule  for  a  normal 
price  distribution  with  known  mean  and  unknown  variance  is  not  myopic, 
because  observing  a  very  low  offer  increases  the  variance  estimate  and 
therefore  increases  the  likelihood  of  subsequently  observing  a  high 
offer. 


4.  Unknown Price Distribution with No Recall

The main issue here is when the optimal policy is a reservation-price
policy. Rothschild [12] has examined the multinomial distribution
with a Dirichlet prior and found the optimal policy to be reservation.
Our result is more general in that it is not specific to a particular
family of distributions; however, it does not cover the
multinomial/Dirichlet.

Definition. Given a vector of observed offers, let a sufficient offer
be an offer which, if observed next, would cause a seller, following an
optimal policy, to stop and sell. Define an insufficient offer similarly.
If the indicated action is strictly preferred over the alternative,
we call the offer strictly sufficient or insufficient.

Consider  two  sellers  who  have  received  almost  identical  sequences 
of  offers  (all  but  one  offer  identical) ,  and  who  started  with  identical 
priors. Suppose that the seller who received the higher of the
nonidentical offers has a next-offer distribution which puts more mass
on  high  offers  than  that  of  the  other  seller,  and  suppose  also  that 
the  expected  gain  from  one  more  offer  of  this  seller  is  not  too  much 
greater  than  that  of  the  other  seller.  This  roughly  describes  a 
condition  under  which  a  reservation-price  policy  will  prevail. 

Theorem 2. Suppose that for all N-component vectors of observed
offers $\mathbf{p} \ge \mathbf{q}$ which differ in exactly one component,

$$F(x \mid \mathbf{p}) \le F(x \mid \mathbf{q}) \quad \text{for all } x,$$

and

$$G(x \mid \mathbf{q}) \ge G(x + \Delta/N \mid \mathbf{p}) \quad \text{for all } x,$$

where $\Delta$ denotes the positive component in $\mathbf{p} - \mathbf{q}$.
Then

(i) the difference in the sellers' expectations of the next
offer observed, Z, is bounded by 0 and $\Delta/N$, i.e.,

$$0 \le E[Z \mid \mathbf{p}] - E[Z \mid \mathbf{q}] \le \Delta/N. \tag{4.1}$$

(ii) For all t (including $t = \infty$), the difference in the
sellers' pre-posterior expectations of the value of continued
search is bounded by 0 and $\Delta/N$, i.e.,

$$0 \le E_{\mathbf{p}}[V_{t-1}(\mathbf{p}, Z)] - E_{\mathbf{q}}[V_{t-1}(\mathbf{q}, Z)] \le \Delta/N, \tag{4.2}$$

where $E_{\mathbf{p}}$ denotes expectation over Z distributed as
$F(\cdot \mid \mathbf{p})$. In particular, if $p_N$ and $q_N$ are
insufficient offers for $p_1, \ldots, p_{N-1}$ and $q_1, \ldots, q_{N-1}$,
respectively, then

$$0 \le V_t(\mathbf{p}) - V_t(\mathbf{q}) \le \Delta/N. \tag{4.3}$$

(iii) The optimal policy is a reservation-price policy.


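Proof. For any random variable Y with distribution function F,
$E[Y] = \int_0^\infty \bar F(y)\, dy - \int_{-\infty}^0 F(y)\, dy$. Thus

$$E[Z \mid \mathbf{p}] - E[Z \mid \mathbf{q}] - \Delta/N = \lim_{T \to -\infty} \int_T^\infty [\bar F(p \mid \mathbf{p}) - \bar F(p \mid \mathbf{q})]\, dp - \Delta/N \tag{4.4}$$

$$\le \limsup_{T \to -\infty} \Big[ \int_T^\infty \bar F(p \mid \mathbf{p})\, dp - \int_{T - \Delta/N}^\infty \bar F(p \mid \mathbf{q})\, dp \Big]$$

$$\le \limsup_{T \to -\infty} \big[ G(T \mid \mathbf{p}) - G(T - \Delta/N \mid \mathbf{q}) \big] \le 0.$$

This and the stochastic dominance of $F(x \mid \mathbf{p})$ over
$F(x \mid \mathbf{q})$ establish (i).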


The other finite-horizon conclusions of the theorem will be
established by induction on the number of periods remaining, using the
recursive formula

$$V_t(\mathbf{p}) = \max\Big\{ p_N,\; -c + \int V_{t-1}(\mathbf{p}, p)\, dF(p \mid \mathbf{p}) \Big\}. \tag{4.5}$$

For the one-period problem, (ii) follows from (i) since
$V_0(\mathbf{p}, p) = p$. To establish (iii), let x be an insufficient
offer and y a strictly sufficient offer for $\mathbf{p}$. (If x or y does
not exist, the reservation price is trivial.) From the recursion (4.5),

$$E_{(\mathbf{p}, y)}[V_0(\mathbf{p}, y, Z)] < y + c, \tag{4.6}$$

$$E_{(\mathbf{p}, x)}[V_0(\mathbf{p}, x, Z)] \ge x + c. \tag{4.7}$$

Since $V_0(\mathbf{r}, Z) = Z$ for any history $\mathbf{r}$ and all Z,
these inequalities imply

$$E_{(\mathbf{p}, y)}[Z] - E_{(\mathbf{p}, x)}[Z] < y - x.$$

If y < x this contradicts the right-hand inequality of (4.1), which
has already been established. Thus $y \ge x$, and the optimal one-period
policy is reservation. Now assume (ii) and (iii) hold for the t-period
problem. By the induction hypothesis,
$V_t(\mathbf{q}, Z) \le V_t(\mathbf{p}, Z)$. Thus by stochastic dominance

$$E_{\mathbf{q}}[V_t(\mathbf{q}, Z)] \le E_{\mathbf{q}}[V_t(\mathbf{p}, Z)] \le E_{\mathbf{p}}[V_t(\mathbf{p}, Z)].$$

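This establishes the nonnegativity in (ii). Let P, Q be the t-period
reservation prices for $\mathbf{p}$, $\mathbf{q}$, respectively. Then

$$E_{\mathbf{p}}[V_t(\mathbf{p}, Z)] - E_{\mathbf{q}}[V_t(\mathbf{q}, Z)] = \int_P^\infty p\, f(p \mid \mathbf{p})\, dp - \int_Q^\infty p\, f(p \mid \mathbf{q})\, dp + \int_{-\infty}^P V_t(\mathbf{p}, p)\, f(p \mid \mathbf{p})\, dp - \int_{-\infty}^Q V_t(\mathbf{q}, p)\, f(p \mid \mathbf{q})\, dp.$$

It follows from the induction hypothesis on (ii) that $P \ge Q$.

After some algebraic manipulations, the above becomes

$$\int_Q^\infty p\, [f(p \mid \mathbf{p}) - f(p \mid \mathbf{q})]\, dp + \int_{-\infty}^Q V_t(\mathbf{q}, p)\, [f(p \mid \mathbf{p}) - f(p \mid \mathbf{q})]\, dp \tag{4.8}$$

$$+ \int_Q^P [V_t(\mathbf{p}, p) - p]\, f(p \mid \mathbf{p})\, dp + \int_{-\infty}^Q [V_t(\mathbf{p}, p) - V_t(\mathbf{q}, p)]\, f(p \mid \mathbf{p})\, dp.$$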


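After integration by parts the first two terms become

$$\int_Q^\infty [\bar F(p \mid \mathbf{p}) - \bar F(p \mid \mathbf{q})]\, dp + \int_{-\infty}^Q \frac{\partial}{\partial p}[V_t(\mathbf{q}, p)]\, [\bar F(p \mid \mathbf{p}) - \bar F(p \mid \mathbf{q})]\, dp.$$

By the induction hypothesis
$\partial/\partial p\,[V_t(\mathbf{q}, p)] \le 1/(N+1)$ for $p \le Q$,
so using (4.1), the first two terms of (4.8) are bounded above by

$$[1 - 1/(N+1)] \int_Q^\infty [\bar F(p \mid \mathbf{p}) - \bar F(p \mid \mathbf{q})]\, dp + \big[E[Z \mid \mathbf{p}] - E[Z \mid \mathbf{q}]\big]/(N+1)$$

$$\le [G(Q \mid \mathbf{p}) - G(Q \mid \mathbf{q})]\, N/(N+1) + \Delta/(N^2 + N).$$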


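Next we establish a bound on the third term of (4.8). For
$Q \le p \le P$,

$$V_t(\mathbf{p}, p) - p \le V_t(\mathbf{p}, Q) - Q + (p - Q)\Big[ \max_{Q \le p \le P} \Big\{ \frac{\partial}{\partial p}[V_t(\mathbf{p}, p)] \Big\} - 1 \Big].$$

Since $V_t(\mathbf{p}, Q) - Q = V_t(\mathbf{p}, Q) - V_t(\mathbf{q}, Q) \le \Delta/(N+1)$
and $\partial/\partial p\,[V_t(\mathbf{p}, p)] \le 1/(N+1)$ for
$Q \le p \le P$ by hypothesis,

$$V_t(\mathbf{p}, p) - p \le \Delta/(N+1) - (p - Q)\, N/(N+1), \qquad Q \le p \le P. \tag{4.9}$$

Evaluating (4.9) for p = P and noting that $V_t(\mathbf{p}, P) = P$ yields
an upper bound on P of $Q + \Delta/N$. Thus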


$$\int_Q^P [V_t(\mathbf{p}, p) - p]\, f(p \mid \mathbf{p})\, dp \le (N+1)^{-1} \int_Q^{Q + \Delta/N} [\Delta - pN + QN]\, f(p \mid \mathbf{p})\, dp.$$

Integrating the right-hand side by parts yields an upper bound

$$\bar F(Q \mid \mathbf{p})\, \Delta/(N+1) - [G(Q \mid \mathbf{p}) - G(Q + \Delta/N \mid \mathbf{p})]\, N/(N+1).$$

The last term of (4.8) is bounded by $F(Q \mid \mathbf{p})\, \Delta/(N+1)$.
Combining these bounds on the terms of (4.8) yields

$$E_{\mathbf{p}}[V_t(\mathbf{p}, Z)] - E_{\mathbf{q}}[V_t(\mathbf{q}, Z)] \le \Delta/N + [G(Q + \Delta/N \mid \mathbf{p}) - G(Q \mid \mathbf{q})]\, N/(N+1).$$

The second term of the right-hand side is nonpositive by hypothesis,
and so (ii) is established for the (t + 1)-period problem. The proof
of (iii) for the (t + 1)-period problem follows exactly as in the
one-period case by use of (ii).

For $t = \infty$, conclusion (ii) follows as a limiting case of the
finite-horizon result. Conclusion (iii) then follows as before. □

If the next-offer distribution given a history of offers $\mathbf{p}$ is
merely a shifted version of that given $\mathbf{q}$, and if the shift is
between 0 and $\Delta/N$ in magnitude, then the conditions of Theorem 2 will
hold. Thus distributions with unknown location- or mean-related parameters
might be expected to have optimal policies which are reservation.

Example 4.1. Normal Distribution with Unknown Mean

Let $Z \sim \mathcal{N}(\mu, 1)$ with $\mu$ an unknown parameter with prior
$\mathcal{N}(\mu_0, 1/\tau)$. After observing $\mathbf{p}$, the posterior on
$\mu$ is $\mathcal{N}(\mu_N, 1/(\tau+N))$ and the posterior on Z is
$\mathcal{N}(\mu_N, 1 + 1/(\tau+N))$, where

$$\mu_N = \Big(\tau \mu_0 + \sum_i p_i\Big) / (\tau + N).$$

Since the posterior of Z given $\mathbf{q}$ is
$\mathcal{N}(\mu_N - \Delta/(\tau+N),\; 1 + 1/(\tau+N))$,

$$F(x \mid \mathbf{q}) = F(x + \Delta/(\tau + N) \mid \mathbf{p}).$$

Thus the hypotheses of Theorem 2 are satisfied.
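
The last step of Example 4.1 can be spelled out as follows. This is a short sketch, with the shift abbreviated as δ (a symbol introduced here for convenience) and with τ ≥ 0 assumed; it uses only that F(· | p) is nondecreasing and G(· | p) is nonincreasing.

```latex
\begin{align*}
\delta &:= \Delta/(\tau+N), \qquad 0 \le \delta \le \Delta/N, \\
F(x \mid \mathbf{p}) &\le F(x+\delta \mid \mathbf{p}) = F(x \mid \mathbf{q}), \\
G(x \mid \mathbf{q}) &= \int_x^\infty \bar F(u+\delta \mid \mathbf{p})\, du
   = G(x+\delta \mid \mathbf{p}) \;\ge\; G(x+\Delta/N \mid \mathbf{p}).
\end{align*}
```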


Example 4.2. Exponential Distribution with an Exponential Prior

Let X have an exponential distribution
$F(x \mid \lambda) = 1 - e^{-\lambda x}$, where $\lambda$ is an unknown
parameter with an exponential prior density
$\alpha_0 e^{-\alpha_0 \lambda}$. Let the next offer be Z = M - X. (M is the
maximum offer possible and is presumed known.) The posterior density of
$\lambda$ given a history of offers $\mathbf{p}$ is

$$h(\lambda \mid \mathbf{p}) = \frac{\alpha_N^{N+1} \lambda^N e^{-\alpha_N \lambda}}{N!},$$

where $\alpha_N = \alpha_0 + \sum_i (M - p_i)$. The posterior distribution
of Z is

$$F(z \mid \mathbf{p}) = \Big[ \frac{\alpha_N}{\alpha_N + M - z} \Big]^{N+1}.$$

For $\mathbf{p} \ge \mathbf{q}$,
$F(z \mid \mathbf{p}) \le F(z \mid \mathbf{q})$, so the first hypothesis of
Theorem 2 is satisfied. To verify the second hypothesis, note that

$$G(x \mid \mathbf{p}) = \int_x^M \Big\{ 1 - \Big[ \frac{\alpha_N}{\alpha_N + M - z} \Big]^{N+1} \Big\}\, dz = \frac{\alpha_N^{N+1}}{N(\alpha_N + M - x)^N} + M - \alpha_N/N - x.$$

Let k be the index of the nonzero coordinate of
$\mathbf{p} - \mathbf{q}$. Then

$$G(x \mid \mathbf{p}) - G(x - \Delta/N \mid \mathbf{q}) = d(p_k) - d(q_k),$$

where

$$d(y) = \frac{(K_1 + M - y)^{N+1}}{N\,\big(K_2 + (M - y)(N+1)/N\big)^N},$$

$$K_1 = \alpha_0 + \sum_{i \ne k} (M - p_i),$$

and

$$K_2 = K_1 + M - x - (M - p_k)/N.$$

Since $d'(y) \le 0$ for $y \le p_k$, $d(p_k) - d(q_k) \le 0$, and so the
second hypothesis of the theorem is satisfied.
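
The closed forms in Example 4.2 lend themselves to a numerical spot check. The sketch below is an illustration under assumed parameter values, not part of the original argument: the function names and the grid-based check `hypotheses_hold` are introduced here, and passing the check on a grid is evidence for, not a proof of, the two hypotheses of Theorem 2.

```python
def alpha(history, alpha0, M):
    """Posterior parameter alpha_N = alpha_0 + sum of (M - p_i)."""
    return alpha0 + sum(M - p for p in history)


def F(z, history, alpha0, M):
    """Posterior distribution function of the next offer, for z <= M."""
    a, n = alpha(history, alpha0, M), len(history)
    return (a / (a + M - z)) ** (n + 1)


def G(x, history, alpha0, M):
    """Closed form of the integral of 1 - F(z | history) over [x, M], x <= M."""
    a, n = alpha(history, alpha0, M), len(history)
    return a ** (n + 1) / (n * (a + M - x) ** n) + M - a / n - x


def hypotheses_hold(p, q, alpha0, M, grid=200):
    """Spot-check both hypotheses of Theorem 2 on a grid of x values.

    Assumes p >= q componentwise and that they differ in one component.
    """
    n = len(p)
    shift = sum(pi - qi for pi, qi in zip(p, q)) / n   # Delta / N
    lo = min(q) - 5.0
    xs = [lo + (M - shift - lo) * i / grid for i in range(grid + 1)]
    first = all(F(x, p, alpha0, M) <= F(x, q, alpha0, M) + 1e-12 for x in xs)
    second = all(
        G(x, q, alpha0, M) >= G(x + shift, p, alpha0, M) - 1e-12 for x in xs
    )
    return first and second
```

For instance, `hypotheses_hold((4.0, 8.0, 5.0), (4.0, 6.0, 5.0), 1.0, 10.0)` compares two histories that differ only in the second offer.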




5.  Geometric and Random Offer Rates

The preceding results assumed a periodic offer rate. However,
those results involving the multinomial distribution with a Dirichlet
prior can be applied to the case of a geometric offer rate by adding
a parameter $p_0 = 0$ to describe a null offer and a parameter $N_0$ to
describe the frequency of periods with no offer.

Under certain conditions the preceding results can be applied when the
offer rate is random. Let H(·) be the distribution of times between
offers under a random offer rate assumption. Let

$$\mu(w) = \frac{\int_w^\infty (\xi - w)\, dH(\xi)}{1 - H(w)}.$$

Note that $\mu(0)$ is the mean time between offers, and $\mu(w)$ is the
expected remaining time until the next offer given the last offer was
received w units of time ago.

Let $V_t(\mathbf{p})$ be the maximum expected net return given a history
of offers $\mathbf{p}$, a maximal number t of offers which can be considered
in the future, and given that an offer has just been received. Supposing
an amount of time w has passed since the last offer, the
decision on whether or not to accept the best offer still in force is
determined by

$$\max\Big\{ y(\mathbf{p}),\; -c\,\mu(w) + \int V_{t-1}(\mathbf{p}, p)\, dF(p \mid \mathbf{p}) \Big\},$$

where

$$y(\mathbf{p}) = \begin{cases} \max_{i \le N}\{p_i\} & \text{recall allowed} \\ p_N & \text{no recall.} \end{cases}$$

We now show that if H(·) has the following "aging" property, then the
only possible times at which an optimal policy will stop are when an
offer has just been received.

Definition. H(·) is new better than used in expectation (NBUE) if and
only if $\mu(w) \le \mu(0)$ for all $w \ge 0$.

The  notion  of  new  better  than  used  in  expectation  originates  in 
reliability  theory.  It  constitutes  one  of  the  weakest  notions  of  aging 
of  physical  devices. 
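
To make the NBUE condition concrete, the mean residual time can be evaluated numerically for a few inter-offer distributions. The sketch below is illustrative: it relies on the standard identity that the mean residual time equals the integral of the survival function 1 - H over [w, ∞) divided by 1 - H(w), uses a simple midpoint rule truncated at a finite upper limit, and the three example distributions are assumptions chosen here, not taken from the report.

```python
import math


def mean_residual_time(survival, w, upper, steps=20000):
    """Approximate mu(w) = (integral of survival over [w, upper]) / survival(w)."""
    h = (upper - w) / steps
    area = sum(survival(w + (i + 0.5) * h) for i in range(steps)) * h
    return area / survival(w)


def looks_nbue(survival, grid, upper, tol=1e-4):
    """Check mu(w) <= mu(0) on a finite grid of w values (a spot check only)."""
    mu0 = mean_residual_time(survival, 0.0, upper)
    return all(
        mean_residual_time(survival, w, upper) <= mu0 + tol for w in grid
    )


# Survival functions 1 - H(s) for three hypothetical inter-offer distributions.
def exponential(s):
    return math.exp(-2.0 * s)            # memoryless: mu(w) = 1/2 for all w


def uniform(s):
    return max(0.0, 1.0 - s)             # uniform on [0, 1]: mu(w) = (1 - w)/2


def hyperexponential(s):
    return 0.5 * math.exp(-s) + 0.5 * math.exp(-10.0 * s)  # mu(w) grows with w
```

The exponential case satisfies NBUE with equality and the uniform case satisfies it strictly, whereas the hyperexponential mixture violates it, so Theorem 3 would not apply to that case.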

Theorem 3. If H(·) is NBUE, then the optimal policy never stops
in between offers.

Proof. Given an offer has just been received, the optimal policy
continues if and only if

$$y(\mathbf{p}) < -c\,\mu(0) + \int V_{t-1}(\mathbf{p}, p)\, dF(p \mid \mathbf{p}).$$

Given w units of time have passed since the last offer, the optimal
policy continues if and only if

$$y(\mathbf{p}) < -c\,\mu(w) + \int V_{t-1}(\mathbf{p}, p)\, dF(p \mid \mathbf{p}).$$

Since $\mu(0) \ge \mu(w)$ the result follows. □


By Theorem 3, all the results of Sections 3 and 4 can be extended
to the case of a random offer rate with NBUE inter-offer times by
replacing c by $c\,\mu(0)$.

6.  Conclusions

When  the  offer  distribution  is  unknown,  the  information  obtained 
from  previous  offers  can  influence  the  distribution  of  the  next  offer 
in  a  very  elaborate  fashion.  Thus  it  is  not  surprising  that  the 
optimal  policies  are  generally  more  complex  than  in  the  case  of  a 
known  distribution.  The  conditions  we  have  given  which  guarantee 
optimal policies which are no more complex than in the known
distribution case are clearly restrictive; they would fail to hold for most
families  of  distributions.  However,  as  we  have  shown,  there  are 
important  cases  where  the  optimal  policies  will  retain  the  same 
simplicity  and  properties  of  the  known  distribution  case. 

Future  research  in  this  area  should  perhaps  consider  optimization 
among  a  smaller  class  of  policies  which  are  most  intuitive  and  easily 
implemented,  and  might  seek  to  develop  bounds  on  the  loss  resulting 
from  such  a  policy  restriction.  The  work  by  Derman,  Lieberman,  and 
Ross [5] is a step in this direction. Another area of investigation
should be random offer rates when the NBUE condition does not hold.

A number of authors have established the properties of optimal selling
policies when the distribution of offers is known and the offers are received
periodically. This report investigates the conditions under which these same
properties hold for an unknown offer distribution which is updated as
successive offers are received.

The selling problem has strong similarities but also important differences
with the problem of purchasing a commodity subject to an unknown price
distribution, and both arise in situations other than buying or selling an
asset. Some applications of the model in quality assurance and other
settings are briefly discussed.